Journal of General Virology (1994), 75, 1789-1794. Printed in Great Britain 




Nucleotide sequence and expression of the spike (S) gene of canine 
coronavirus and comparison with the S proteins of feline and porcine 
coronaviruses 

John G. Wesseling,† Harry Vennema, Gert-Jan Godeke, Marian C. Horzinek 
and Peter J. M. Rottier* 

Virology Division, Department of Infectious Diseases and Immunology, Veterinary Faculty, Utrecht University, 
P.O. Box 80.165, 3508 TD Utrecht, The Netherlands 


We have cloned, sequenced and expressed the spike (S) gene of canine
coronavirus (CCV; strain K378). Its deduced amino acid sequence has
revealed features in common with other coronavirus S proteins: a stretch
of hydrophobic amino acids at the amino terminus (the putative signal
sequence), another hydrophobic region at the carboxy terminus (the
membrane anchor), heptad repeats preceding the anchor, and a
cysteine-rich region located just downstream from it. Like other
representatives of the same antigenic cluster (CCV-Insavc-1 strain,
feline infectious peritonitis and enteric coronaviruses, porcine
transmissible gastroenteritis and respiratory coronaviruses, and the
human coronavirus HCV 229E), the CCV S polypeptide lacks a proteolytic
cleavage site present in many other coronavirus S proteins. Pairwise
comparisons of the S amino acid sequences within the antigenic cluster
demonstrated that the two CCV strains (K378 and Insavc-1) are 93.3%
identical, about as similar to each other as they are to the two feline
coronaviruses. The porcine sequences are clearly more divergent, mainly
due to the large differences in the amino-terminal (residues 1 to 300)
domains of the proteins; when only the carboxy-terminal parts (residues
301 and on) are considered the homologies between the canine, feline and
porcine S polypeptides are generally quite high, with identities ranging
from 90.8% to 96.8%. The human coronavirus is less related to the other
members of the antigenic group. A phylogenetic tree constructed on the
basis of the S sequences showed that the two CCVs are evolutionarily
more related to the feline than to the porcine viruses. Expression of
the CCV S gene using the vaccinia virus T7 RNA polymerase system yielded
a protein of the expected Mr (approximately 200K) which could be
immunoprecipitated with an anti-feline infectious peritonitis virus
polyclonal serum and which was indistinguishable from the S protein
synthesized in CCV-infected cells.


Coronaviruses are large, enveloped, positive-stranded RNA viruses that
cause respiratory, enteric and generalized disease in humans and
domestic animals. Canine coronavirus (CCV) was first isolated from the
faecal specimens of American military dogs with diarrhoeal disease (Binn
et al., 1974). It infects dogs of any breed or age, causing depression,
anorexia, vomiting and diarrhoea in young animals. The dogs generally
recover spontaneously 7 to 10 days after infection, but the diarrhoea
may persist for more than 2 weeks. Death may occur 1 to 3 days after the
onset of disease, especially in young pups (Carmichael & Binn, 1981).
The virus replicates in the enterocytes of the small intestine and has
been found in the intestinal lymph nodes. Vaccines to protect against
CCV disease are beginning to appear on the market. Parenteral
inoculation of dogs with CCV (either attenuated or not) did not result
in disease, but the animals were not protected against oral challenge
(Carmichael & Binn, 1981).

† Present address: Department of Endocrinology and Reproduction,
Medical Faculty, Erasmus University of Rotterdam, Rotterdam,
The Netherlands.

The nucleotide sequence data presented in this paper have been
submitted to the EMBL database and assigned the accession number
X77047.

Feline infectious peritonitis virus (FIPV) and feline enteric
coronavirus (FECV), transmissible gastroenteritis virus of swine (TGEV),
porcine respiratory coronavirus (PRCV) and CCV possess common antigenic
determinants localized on the three major virion proteins (Horzinek et
al., 1982; Sanchez et al., 1990). A human coronavirus (HCV; strain 229E)
also showed cross-reactivity at the level of the nucleocapsid protein
(Horzinek et al., 1982), which led Siddell et al. (1983) to propose that
these coronaviruses belong to one antigenic cluster. However, the
cross-reaction was not reproduced in a more extensive study (Sanchez et
al., 1990). The CCV virion RNA potentially encodes nine protein species:
the 160K spike (S) glycoprotein, the 30K membrane protein (M), the 43K
nucleocapsid protein (N), the 9K small membrane protein (SM) and five
non-structural polypeptides [designated 1b, 3a, 4, 7a and 7b (Horsburgh
et al., 1992) or 6a and 6b (Vennema et al., 1992)].

CCV is the least characterized virus from this antigenic group. Because
the S protein of coronaviruses is generally considered to be a candidate
antigen for the development of recombinant DNA-based vaccines, we
present here the cloning and sequencing of the gene encoding the S
protein of CCV strain K378. We compared this sequence with that of the
CCV strain Insavc-1 (Horsburgh et al., 1992) and with other known S
sequences of the antigenic cluster to determine their evolutionary
relationships. The S gene was then expressed using the vaccinia virus T7
RNA polymerase system.

Fig. 1. Amino acid sequence comparisons and evolutionary tree based on
the S proteins of dog, cat, swine and human coronaviruses. (a)
Comparison of the predicted amino acid sequence of the S protein of CCV
strain K378 with those from CCV strain Insavc-1 (Horsburgh et al.,
1992), from the feline coronaviruses FECV strain 79-1683 (accession
Q25539 Geneseq Database) and FIPV strain 79-1146 (de Groot et al.,
1987a), from the porcine coronaviruses TGEV (Jacobs et al., 1987) and
PRCV (Rasschaert et al., 1990), and from the human coronavirus HCV 229E
(Raabe et al., 1990). Residues not shown are identical to those of CCV
K378. Dashes have been introduced to obtain optimal alignment. The
putative amino-terminal signal sequence is printed in bold and the
putative membrane anchor (= = = =) is underlined. Potential
N-glycosylation sites (N) and the cysteine-rich (C) region are
indicated. Consensus in the sequence is symbolized by an asterisk
(complete identity) or a dot (conserved amino acid changes); a blank
space indicates the absence of conservation. (b) Reconstruction of a
phylogenetic tree of the coronaviruses based on the S amino acid
sequences. For the calculation of distances between different S protein
sequences a 'Unity Distance' table was used by 'Homologies/Distances'.
The gap penalty was set to the highest mismatch value in the matrix (10).

The CCV strain K378 (Barlough et al., 1984; obtained from Dr H. Flore,
Solvay-Duphar, Weesp, The Netherlands) was grown in Felis catus whole
fetus (fcwf-D) cells (obtained from Dr N. C. Pedersen). RNA was isolated
from virus-infected cells and poly(A)+ RNA was used as the template for
cDNA synthesis (Gubler & Hoffman, 1983) using oligo(dT) and random
hexamer primers (Pharmacia). Poly(dC)-tailed cDNA was annealed to a
PstI-digested, dG-tailed pUC9 plasmid (Pharmacia) and the recombinants
were used to transform Escherichia coli strain PC2495.
Ampicillin-resistant colonies were transferred onto nitrocellulose
filters, lysed in situ (Sambrook et al., 1989) and hybridized with the
32P-labelled S gene of FIPV (de Groot et al., 1987a). Plasmid DNA was
obtained from 35 positive clones (Birnboim & Doly, 1979), and the
inserts from these clones were analysed by restriction enzyme mapping
and Southern blotting, using parts of the FIPV S gene as probes (results
not shown). Three overlapping cDNA clones were obtained that covered the
entire CCV S coding region. These were sequenced using the M13
dideoxynucleotide chain termination procedure (Sanger et al., 1977). The
nucleotide sequences were assembled and analysed with the aid of the
computer programs of Devereux et al. (1984).
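
As an illustration of the kind of analysis applied to an assembled
sequence, the following minimal Python sketch scans the three forward
reading frames for the longest open reading frame and translates it. It
is not the software used in this study (the programs of Devereux et al.
were used); the function name longest_orf and the toy sequence are
hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def longest_orf(dna):
    """Return (start, end, protein) of the longest ATG...stop ORF.

    Only the three forward frames are scanned. `start` and `end` are
    0-based, end-exclusive, and `end` includes the stop codon.
    """
    dna = dna.upper().replace("U", "T")
    best = (0, 0, "")
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and CODON_TABLE.get(codon) == "*":
                if i + 3 - start > best[1] - best[0]:
                    protein = "".join(
                        CODON_TABLE.get(dna[j:j + 3], "X")
                        for j in range(start, i, 3)
                    )
                    best = (start, i + 3, protein)
                start = None  # ORF closed; look for the next start codon
    return best


# Toy sequence (hypothetical): a CTAAAC motif followed by a short ORF.
toy = "CTAAACGGATGATTGTGCTCTAATT"
orf_start, orf_end, orf_protein = longest_orf(toy)
```

For the toy sequence the ORF begins at the single ATG and ends with the
TAA stop codon; a real analysis would also consider the reverse strand
and ambiguous bases.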

The nucleotide sequence obtained contains a single open reading frame
(ORF) of 4359 nucleotides and has a coding capacity for a 1453 amino
acid polypeptide of predicted Mr 160K. Twenty-seven nucleotides before
the start of the S ORF a CTAAAC sequence is found, which serves as the
minimal conserved signal for transcription, as has also been observed
for FECV, FIPV, TGEV, PRCV and CCV Insavc-1 (Horsburgh et al., 1992;
Spaan et al., 1988). The predicted S protein (Fig. 1a) contains 31
potential N-glycosylation sites (Asn-X-Thr or Asn-X-Ser). Assuming a
mean contribution to the total Mr of 2.1K per carbohydrate chain (Hunter
et al., 1983), the mature S glycoprotein would be approximately 220K.
This value is slightly larger than the apparent Mr of CCV S (see Fig.
2), which may indicate that not all potential glycosylation sites are
used. The CCV S protein has a number of other features typical of
coronavirus S proteins. First, at the amino terminus of the polypeptide
a stretch of 20 mainly hydrophobic amino acids represents the putative
signal sequence, which is probably cleaved between Cys-18 and Thr-19 as
this is the predicted signal peptidase recognition site (von Heijne,
1986). Second, at the carboxy terminus (residues 1395 to 1415) a second
hydrophobic region is observed which probably serves as the membrane
anchor (de Groot et al., 1987a) and is followed by a cysteine-rich
region. Furthermore, the C-terminal part contains two regions with
heptad repeat periodicity (residues 1068 to 1150 and 1336 to 1381) which
were proposed to be essential elements for the formation of the
elongated stem structure of this peplomer-forming protein (de Groot et
al., 1987b). Finally, the CCV S protein does not contain any basic amino
acid sequences related to the motifs RRXRR or RRAHRR (where X is F, S, H
or A) which are the sites at which mouse hepatitis virus (MHV) and
infectious bronchitis virus (IBV) S proteins are proteolytically cleaved
to yield the S1 and S2 polypeptides (Spaan et al., 1988). Like the FIPV
and TGEV S proteins, the CCV S protein is probably not cleaved, which is
consistent with experimental data (see Fig. 2). However, CCV (like FIPV)
is capable of inducing cell fusion in feline and canine cells, which
shows that protein cleavage is not required for cell fusion activity.
This conclusion is in agreement with recent results reported for MHV S,
which is normally cleaved. When cleavage was prevented by mutation of
the cleavage site, this did not abolish the fusion potential of the
expressed protein (Stauber et al., 1993; Taguchi, 1993). Consistent with
this observation, cleavage site mutants of MHV isolated from
persistently infected cells had also retained their cell fusing
capacity, albeit to a much lower extent (Gombold et al., 1993).

Fig. 2. Expression of the CCV S protein. HeLa cells were infected with
the recombinant vaccinia virus vTF7-3 which produces the T7 polymerase
(Fuerst et al., 1986) and were transfected 1 h later with pTUGCCVS (lane
4), pTUGFIPVS (lane 6), pTUG3 (lane 2) or were mock-transfected (lane
1). At 16 h post-infection (p.i.) the cells were labelled for 1 h with
100 µCi/ml L-[35S]methionine in methionine-free medium. For comparison,
CCV- and FIPV-infected fcwf-D cells were labelled similarly for 1 h
starting at 6 h p.i. and subsequently incubated in chase medium for 1 h
(lanes 3 and 5, respectively). Cell lysates were prepared and
immunoprecipitations carried out using ascitic fluid (A36) derived from
an FIPV-infected cat according to Vennema et al. (1990a). Numbers at the
left indicate the positions of marker proteins run in the same gel; the
arrow at the right indicates the position of the S proteins.

Table 1. Sequence comparisons of S proteins within the CCV antigenic
cluster (identical and similar amino acid residues)*

            K378   Insavc-1   FECV    FIPV    TGEV    PRCV    HCV
K378       100.0     93.3     93.2    92.4    82.8    89.6    51.7
Insavc-1    95.2    100.0     92.1    91.4    81.9    88.5    51.6
FECV        96.0     95.0    100.0    95.3    84.1    90.0    52.0
FIPV        95.5     94.8     97.2   100.0    83.4    90.0    52.7
TGEV        89.3     88.5     90.2    90.0   100.0    95.5    52.0
PRCV        93.5     92.6     94.0    93.9    97.3   100.0    52.8
HCV         70.8     70.4     70.9    71.5    71.5    72.1   100.0

* Percentages of identical (right above 100% diagonal) and similar
(left below 100% diagonal) amino acid residues calculated from
pairwise sequence alignments of the different S proteins (UWGCG gap
program using gap penalty of 100 and gap length weight of 0.1).

The evolutionary relationships and conserved or variable structural
features were analysed using a computer-based comparison of the CCV K378
spike protein sequence with those of the S proteins of CCV Insavc-1
(Horsburgh et al., 1992), FECV 79-1683 (accession Q25539 Geneseq
Database), FIPV 79-1146 (de Groot et al., 1987a), PRCV (Rasschaert et
al., 1990), TGEV (Jacobs et al., 1987) and HCV 229E (Raabe et al.,
1990); the results are shown in Fig. 1(a) and Table 1. The S sequences
of the two CCV strains have 94 amino acid differences (93.3% identity)
in addition to a one amino acid deletion in the Insavc-1 strain (Ser at
position 797). They appear to differ about as much from each other as
they do from the feline coronaviruses. The similarities of the two CCV S
sequences with those of FIPV and FECV (91.4% to 93.2% identity) are much
higher than with the two swine coronaviruses PRCV and TGEV (81.9% to
89.6% identity). On the basis of its S sequence, HCV is clearly less
related to the other members of this antigenic cluster (51.6% to 52.8%
identity); this virus has evidently evolved differently (see also Fig.
1b).
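
The identity percentages discussed above come from pairwise alignments.
As a minimal sketch of how such a figure can be derived from two already
aligned sequences, the Python function below counts identical residues
over the aligned columns. It is an illustration only, not the UWGCG gap
program: the function name is hypothetical, the sequences are invented
toy data, and excluding gapped columns from the denominator is an
assumption that may differ from the normalisation used for Table 1.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues between two aligned sequences.

    Columns in which either sequence has a gap are ignored (assumption).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy alignment (hypothetical): 8 ungapped columns, 7 of them identical.
toy_a = "MKVLT-LFAC"
toy_b = "MRVLTSLF-C"
toy_identity = percent_identity(toy_a, toy_b)
```

Applying the same function to every pair of aligned S sequences would
fill the upper triangle of a matrix laid out like Table 1.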

It has been noted before (de Groot et al., 1987a; Luytjes et al., 1987;
Raabe et al., 1990) that most variations in the coronavirus S proteins
are found in the amino-terminal domain. Our sequence comparisons confirm
this point. Considering the data illustrated in Fig. 1(a) the S proteins
can be divided into a moderately conserved amino-terminal domain and a
highly conserved carboxy-terminal domain. This distinction is shown in
Table 2 where the identities between the S proteins have been calculated
separately for the amino-terminal 300 residues and for the remainder of
the polypeptides. In the amino-terminal domain the highest identity
scores are observed for the two canine viruses (87.0%) and for the two
feline viruses (89.6%). Pairwise comparisons of the canine and feline
sequences with each other yielded identities varying from 80.8 to 83.6%.
The TGEV S protein is considerably different from the canine and feline
S sequences in its amino-terminal domain (from 37.8 to 41.2% identity).
In contrast, in its carboxy-terminal part TGEV S protein can be
considered to be very closely related to these same canine and feline
viruses, with identities amounting to 92.2 to 94.3%. These figures are
only slightly lower than those obtained when the feline and canine
sequences are mutually compared in this region (94.1 to 96.8%).
Interestingly, the K378 strain of CCV is almost as closely related to
FIPV and FECV as it is to the Insavc-1 CCV strain in the
carboxy-terminal region of the S protein. When percentage similarities
are compared over the whole protein (Table 1), the K378 strain appears
to be even more similar to the feline viruses than to the other CCV
strain. This pattern of sequence similarities may reflect the
geographical origin of these coronaviruses: the two feline coronaviruses
and CCV K378 originate from the United States whereas CCV Insavc-1 is a
British isolate.

Table 2. Sequence comparisons of N- and C-terminal portions of S
proteins*

            K378   Insavc-1   FECV    FIPV    TGEV
K378       100.0     87.0     83.6    81.2    37.8
Insavc-1    95.0    100.0     82.2    80.8    38.8
FECV        95.7     94.6    100.0    89.6    41.2
FIPV        95.3     94.1     96.8   100.0    38.3
TGEV        93.6     92.2     94.3    94.1   100.0
PRCV        92.0     90.8     92.5    92.8    96.5
HCV         51.5     51.2     51.5    52.1    51.5

* Percentages of identical amino acid residues of different S proteins
calculated from pairwise sequence alignments of residues 1 to 300 (right
above 100% diagonal) and starting from residue 301 (left below 100%
diagonal), using the UWGCG gap program with gap penalty 100 and
gap length weight 0.1.
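
The separate treatment of the amino-terminal 300 residues and the
remainder of the protein can be sketched in Python as follows. This is a
hypothetical illustration rather than the procedure used for Table 2:
here a single existing alignment is split after a given residue of the
reference sequence, gapped columns are skipped, and the function name
and toy sequences are invented for the example.

```python
def identity_by_region(aligned_ref, aligned_other, boundary=300, gap="-"):
    """Return (N-terminal, C-terminal) percent identity.

    The alignment is split after residue number `boundary` of the
    reference sequence, counting residues and not alignment columns.
    Columns containing a gap in either sequence are skipped (assumption).
    """
    if len(aligned_ref) != len(aligned_other):
        raise ValueError("aligned sequences must have the same length")
    counts = {"n": [0, 0], "c": [0, 0]}  # region -> [identical, compared]
    ref_pos = 0  # residue number reached in the reference sequence
    for r, o in zip(aligned_ref, aligned_other):
        if r != gap:
            ref_pos += 1
        if r == gap or o == gap:
            continue
        region = "n" if ref_pos <= boundary else "c"
        counts[region][1] += 1
        if r == o:
            counts[region][0] += 1

    def pct(identical, compared):
        return 100.0 * identical / compared if compared else float("nan")

    return pct(*counts["n"]), pct(*counts["c"])


# Toy alignment (hypothetical) with the boundary after reference residue 4:
# a variable amino-terminal part followed by a conserved remainder.
toy_ref = "MKVL-TAGCW"
toy_other = "MRILSTAGCW"
toy_regions = identity_by_region(toy_ref, toy_other, boundary=4)
```

In the toy case the first region scores 50% and the second 100%, the
same qualitative pattern of a variable amino-terminal domain and a
conserved carboxy-terminal domain described in the text.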

To construct a phylogenetic tree from the comparative data (Fig. 1a) we
have used a program that compares sequences on the basis of distance
matrix files using the neighbour-joining method described by Saitou &
Nei (1987). It appears from the resulting tree (Fig. 1b) that the canine
coronaviruses are evolutionarily more closely related to the feline than
to the porcine coronaviruses. The human coronavirus HCV 229E is more
distant within the antigenic cluster. IBV-M41 and MHV-A59 are most
distantly related (data not shown), a finding that is in keeping with
the fact that these coronaviruses belong to a separate antigenic cluster
(Siddell et al., 1983).
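
The neighbour-joining method of Saitou & Nei (1987) mentioned above can
be sketched in a few lines of Python. This is a minimal illustration
under stated assumptions, not the program used for Fig. 1(b): the
function name and the four-taxon toy distances are hypothetical, only
the branching order is returned (branch lengths are not computed), and
ties are broken by input order.

```python
def neighbour_joining(names, distances):
    """Return the tree topology as nested tuples.

    `distances` maps unordered pairs of names, given as 2-tuples, to
    their pairwise distance.
    """
    d = {frozenset(pair): value for pair, value in distances.items()}
    nodes = list(names)

    def dist(a, b):
        return 0.0 if a == b else d[frozenset((a, b))]

    while len(nodes) > 2:
        n = len(nodes)
        # Net divergence of each node from all the others.
        r = {a: sum(dist(a, b) for b in nodes) for a in nodes}
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                a, b = nodes[i], nodes[j]
                # Saitou & Nei criterion: join the pair with the smallest Q.
                q = (n - 2) * dist(a, b) - r[a] - r[b]
                if best is None or q < best[0]:
                    best = (q, a, b)
        _, a, b = best
        new = (a, b)  # the new internal node is the pair it joins
        rest = [c for c in nodes if c not in (a, b)]
        for c in rest:
            d[frozenset((new, c))] = 0.5 * (dist(a, c) + dist(b, c) - dist(a, b))
        nodes = rest + [new]
    return tuple(nodes)


# Toy distances (hypothetical), additive on the tree ((A,B),(C,D)).
toy_distances = {
    ("A", "B"): 3, ("A", "C"): 8, ("A", "D"): 9,
    ("B", "C"): 9, ("B", "D"): 10, ("C", "D"): 9,
}
toy_tree = neighbour_joining(["A", "B", "C", "D"], toy_distances)
```

With additive toy distances the sketch recovers the grouping of A with B
and of C with D. For real data the input distances would be derived from
the aligned S sequences; the 'Unity Distance' table used in this paper
is not reproduced here.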

In order to express the CCV S ORF, the gene was assembled from three
overlapping cDNA clones. After removal of the 5' non-coding region using
a PCR strategy (not shown), the coding region was cloned into the PstI
and EcoRI sites of the polylinker of pBlueScript KS-. From this plasmid
a BamHI/EcoRI fragment was cloned into the vaccinia virus T7 expression
vector pTUG3 (Vennema et al., 1991). In parallel, a BamHI fragment
encoding the FIPV S protein (de Groot et al., 1987a; Vennema et al.,
1990a) was also cloned into pTUG3. The resulting constructs pTUGCCVS and
pTUGFIPVS were used to express the respective S proteins. HeLa cells
infected with the recombinant vaccinia virus vTF7-3, which produces T7
RNA polymerase (Fuerst et al., 1986), were transfected with the plasmids
and subsequently labelled with [35S]methionine (Amersham). The expressed
products were analysed electrophoretically (Laemmli, 1970) after
immunoprecipitation using a polyclonal FIPV-specific antiserum.

As shown in Fig. 2, the CCV S construct specifically induced the
synthesis of a protein of Mr approximately 200K (lane 4), which is close
to the expected Mr of the glycosylated spike polypeptide (see above). No
such protein was detected after transfection with pTUG3 alone (lane 2)
or after mock transfection (lane 1). The product comigrated with the S
proteins synthesized in cells infected with CCV (lane 3) or FIPV (lane
5). The CCV expression product also comigrated with the FIPV S gene
product (lane 6), in agreement with their similar predicted Mr values.
Collectively, these results prove that the ORF cloned, sequenced and
reconstructed indeed specifies the CCV S protein.
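
The expected Mr of the glycosylated polypeptide referred to above follows
from simple arithmetic on the sequence data. The Python sketch below
counts potential N-glycosylation sites with the motif given in the text
(Asn-X-Thr or Asn-X-Ser) and adds a fixed mass per carbohydrate chain to
the polypeptide mass. Several values are assumptions made for
illustration and are not taken from this paper: the mean residue mass of
about 110 Da is a general rule of thumb, the per-chain contribution is
set to roughly 2K, and the function names and test peptides are
hypothetical.

```python
import re

# Asn-X-Thr or Asn-X-Ser with any residue at X, as stated in the text.
# (Many tools additionally exclude Pro at X; that is not done here.)
# The lookahead allows overlapping motifs to be counted.
SEQUON = re.compile(r"(?=N.[ST])")


def count_sequons(protein):
    """Number of potential N-glycosylation sites in a protein sequence."""
    return len(SEQUON.findall(protein.upper()))


def estimate_glycoprotein_mass(n_residues, n_sites,
                               residue_kda=0.110, glycan_kda=2.1):
    """Return (polypeptide mass, fully glycosylated mass) in kDa.

    residue_kda: assumed mean residue mass (rule of thumb).
    glycan_kda: assumed mean contribution per carbohydrate chain.
    """
    backbone = n_residues * residue_kda
    return backbone, backbone + n_sites * glycan_kda


# Figures from the text: 1453 residues and 31 potential sites.
backbone_kda, glycosylated_kda = estimate_glycoprotein_mass(1453, 31)
```

With these assumptions the polypeptide comes out near 160K and the fully
glycosylated form in the low 220s, in line with the approximate values
quoted in the text; an observed Mr below that estimate is what would be
expected if not every potential site carries a carbohydrate chain.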

CCV causes gastroenteric disease in dogs, often resulting in death of
the more susceptible younger animals (Carmichael & Binn, 1981). Several
conventional vaccines have been tested but none induced long-lasting
protection. With the notorious exception of FIPV, in which antibodies to
the S protein can enhance the infection process (Vennema et al., 1990b),
the S protein of coronaviruses generally appears to be the prime
candidate on which to base a vaccine (Spaan et al., 1990). The bona fide
expression of the CCV S gene reported in this paper may therefore
provide the basis for the development of a recombinant vaccine which,
when suitably presented, e.g. through an adenovirus carrier, may induce
protection.